Speech recognition system for natural language translation.

A speech recognition system displays a source text of one or more words in a source language. The system has an acoustic processor for generating a sequence of coded representations of an utterance to be recognized. The utterance comprises a series of one or more words in a target language different from the source language. A set of one or more speech hypotheses, each comprising one or more words from the target language, are produced. Each speech hypothesis is modeled with an acoustic model. An acoustic match score for each speech hypothesis comprises an estimate of the closeness of a match between the acoustic model of the speech hypothesis and the sequence of coded representations of the utterance. A translation match score for each speech hypothesis comprises an estimate of the probability of occurrence of the speech hypothesis given the occurrence of the source text. A hypothesis score for each hypothesis comprises a combination of the acoustic match score and the translation match score. At least one word of one or more speech hypotheses having the best hypothesis scores is output as a recognition result.
The invention relates to automatic speech recognition. More specifically, the invention relates to automatic speech recognition of an utterance in a target language of a translation of a source text in a source language different from the target language. For example, the invention may be used to recognize an utterance in English of a translation of a sentence in French.

In one study, it was found that the efficiency of a human translator who dictates a translation in one language corresponding to source text in another language is greater than the efficiency of a human translator who writes or types a translation. (See, for example, "Language and Machines - Computers in Translation and Linguistics". National Academy of the Sciences, 1966.)

In one approach to speech recognition, speech hypotheses are scored using two probability models. One model is a language model which estimates the probability that the speech hypothesis would be uttered, but which uses no knowledge or information about the actual utterance to be recognized. The other model is an acoustic model which estimates the probability that an utterance of the speech hypothesis would produce an acoustic signal equal to the acoustic signal produced by the utterance to be recognized.

Statistical language models exploit the fact that not all word sequences occur naturally with equal probability. One simple model is the trigram model of English, in which it is assumed that the probability that a word will be spoken depends only on the previous two words that have been spoken. Trigram language models are relatively simple to produce, and have proven useful in their ability to predict words as they occur in natural language. More sophisticated language models based on probabilistic decision trees, stochastic context-free grammars, and automatically discovered classes of words have also been used.

While statistical language models which use no knowledge or information about the actual utterance to be recognized are useful in scoring speech hypotheses in a speech recognition system, the best scoring speech hypotheses do not always correctly identify the corresponding utterances to be recognized.

It is an object of the invention to provide a speech recognition system which has an improved language model for increasing the accuracy of speech recognition.

It is another object of the invention to provide a speech recognition system which estimates the probability of occurrence of each speech hypothesis using additional knowledge or information about the utterance to be recognized.

The accuracy and the speed of a speech recognition system depend on a large number of factors. One important factor is the complexity of the language, as represented by the number of possible word sequences in the language, and the probability of occurrence of each possible word string. If a language model is able to reduce the uncertainty, or entropy, of the possible word sequences being recognized, then the recognition result will be more accurate than with higher uncertainty.

In the speech recognition system and method according to the present invention, information about the source sentence being translated is used to estimate the probability that each speech hypothesis would be uttered. This probability is estimated with the aid of a translation model.

According to the invention, a speech recognition system comprises means for displaying a source text. The source text comprises one or more words in a source language. An acoustic processor generates a sequence of coded representations of an utterance to be recognized. The utterance comprises a series of one or more words in a target language different from the source language.

The speech recognition system further includes a speech hypothesis generator for producing a set of one or more speech hypotheses. Each speech hypothesis comprises one or more words from the target language. An acoustic model generator produces an acoustic model of each speech hypothesis.

An acoustic match score generator produces an acoustic match score for each speech hypothesis. Each acoustic match score comprises an estimate of the closeness of a match between the acoustic model of the speech hypothesis and the sequence of coded representations of the utterance produced by the acoustic processor.

A translation match score generator produces a translation match score for each speech hypothesis. Each translation match score comprises an estimate of the probability of occurrence of the speech hypothesis given the occurrence of the source text. A hypothesis score generator produces a hypothesis score for each hypothesis. Each hypothesis score comprises a combination of the acoustic match score and the translation match score for the hypothesis.

Finally, the speech recognition system includes a memory for storing a subset of one or more speech hypotheses, from the set of speech hypotheses, having the best hypothesis scores, and an output for outputting at least one word of one or more of the speech hypotheses in the subset of speech hypotheses having the best hypothesis scores.
The speech hypothesis generator may comprise, for example, a candidate word generator for producing a set of candidate words. The set of candidate words consists solely of words in the target language which are partial or full translations of words in the source text. One or more speech hypotheses are generated solely from words in the set of candidate words.

The translation match score for a speech hypothesis may comprise, for example, an estimate of the probability of occurrence of the source text given the occurrence of the speech hypothesis, combined with an estimate of the probability of occurrence of the speech hypothesis. The probability of occurrence of the source text given the occurrence of the speech hypothesis may comprise, for example, an estimate, for each word in the source text, of the probability of the word in the source text given the occurrence of each word in the speech hypothesis.

The acoustic match score may comprise, for example, an estimate of the probability of occurrence of the sequence of coded representations of the utterance given the occurrence of the speech hypothesis. The hypothesis score may then comprise the product of the acoustic match score multiplied by the translation match score.
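The product form of the hypothesis score can be illustrated with a minimal sketch. The function names and the numeric scores below are hypothetical and chosen only for illustration; the log-domain variant is a common numerical convenience, not something the text prescribes.

```python
import math


def hypothesis_score(acoustic_match, translation_match):
    """Product of the acoustic match score and the translation match score."""
    return acoustic_match * translation_match


def log_hypothesis_score(acoustic_match, translation_match):
    """Same ordering as the product, computed as a sum of logs to limit underflow."""
    return math.log(acoustic_match) + math.log(translation_match)


# Hypothetical (acoustic, translation) scores for two competing hypotheses.
hypotheses = {
    "the cat": (0.020, 0.30),
    "the hat": (0.025, 0.01),
}

# The acoustically stronger "the hat" loses once the translation score is applied.
best = max(hypotheses, key=lambda h: hypothesis_score(*hypotheses[h]))
```

In this toy case the translation match score overrides a slightly better acoustic match, which is the effect the combined score is meant to achieve.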

The speech recognition system may further comprise a source vocabulary memory storing a source vocabulary of words in the source language, and a comparator for comparing each word in the source text with each word in the source vocabulary to identify each word in the source text which is not in the source vocabulary. An acoustic model generator produces an acoustic model of each word in the source text which is not in the source vocabulary.

Each word in the source text has a spelling comprising one or more letters. Each letter is either upper case or lower case. The acoustic model generator produces an acoustic model of each word in the source text which is not in the source vocabulary, and which has an upper case first letter.

The acoustic model generator may comprise, for example, a memory for storing a plurality of acoustic 
letter models. An acoustic model of a word is then produced by replacing each letter in the spelling of the 
word with an acoustic letter model corresponding to the letter. 

By providing a speech recognition system and method according to the invention with an improved language model having a translation model for estimating the probability that each speech hypothesis would be uttered given the occurrence of the source text, the accuracy and the speed of speech recognition can be improved.

Brief Description of the Drawing 

Fig. 1 is a block diagram of an example of a speech recognition system according to the invention. 
Fig. 2 is a block diagram of a portion of another example of a speech recognition system according to the invention.
Fig. 3 is a block diagram of a portion of another example of a speech recognition system according to the invention.
Fig. 4 is a block diagram of an example of an acoustic processor for a speech recognition system according to the invention.
Fig. 5 is a block diagram of an example of an acoustic feature value measure for an acoustic processor for a speech recognition system according to the invention.

Referring to Figure 1, the speech recognition system comprises a display 10 for displaying a source text. The source text comprises one or more words in a source language, such as French. The source text may be provided to the display by, for example, a source text input device 12 such as a computer system.

The speech recognition system further comprises an acoustic processor 14 for generating a sequence of coded representations of an utterance to be recognized. The utterance comprises, for example, a series of one or more words in a target language, such as English, different from the source language.

A speech hypothesis generator 16 generates a set of one or more speech hypotheses. Each speech hypothesis comprises one or more words from the target language. For a sentence of, for example, 10 words out of a target language vocabulary of 20,000 words, there are 20,000^10 = 1.024 x 10^43 possible hypotheses.
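The size of this search space can be checked with a few lines of arithmetic. The variable names are illustrative only.

```python
vocabulary_size = 20_000   # words in the target language vocabulary
sentence_length = 10       # words in the sentence

# Every position can hold any vocabulary word, so the count is V ** n.
num_hypotheses = vocabulary_size ** sentence_length

print(f"{num_hypotheses:.3e}")  # prints 1.024e+43
```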

With such a large number of hypotheses, it is not feasible to generate all possible hypotheses. 
Therefore, preferably, the hypothesis generator does not generate all possible hypotheses for the utterance 
to be recognized. Instead, the hypothesis generator starts by finding a reasonable number of single-word 
hypotheses which are good candidates for a portion of the utterance to be recognized, and systematically 
searches for successively longer word strings which are good candidates for longer portions of the 
utterance to be recognized. One such search algorithm is described, for example, in United States Patent
4,748,670 entitled "Apparatus And Method For Determining A Likely Word Sequence From Labels
Generated By An Acoustic Processor."

Figure 2 is a block diagram of a portion of one example of a speech recognition system according to the invention. In this embodiment, the speech recognition system further comprises a candidate word generator 20 for generating a set of candidate words, consisting solely of words in the target language which are partial or full translations of words in the source text. The candidate word generator 20 receives the source text from source text input device 12, and receives translations of each word in the source text from a source-text translation store 22. From the source text and from the translations, candidate word generator 20 generates a set of candidate words consisting solely of words in the target language which are partial or full translations of words in the source text.

The set of candidate words is provided to speech hypothesis generator 16. Preferably, in this embodiment, the speech hypothesis generator 16 generates one or more speech hypotheses solely from words in the set of candidate words from candidate word generator 20.
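A minimal sketch of the candidate word generator follows. It assumes the source-text translation store can be modeled as a dictionary from a lowercase source word to a list of target words; that representation, the function names, and the sample entries are hypothetical.

```python
def candidate_words(source_text, translation_store):
    """Collect every target word that is a partial or full translation
    of some word in the source text."""
    candidates = set()
    for word in source_text.lower().split():
        candidates.update(translation_store.get(word, ()))
    return candidates


def uses_only_candidates(hypothesis, candidates):
    """True if the hypothesis is built solely from candidate words."""
    return all(word in candidates for word in hypothesis.split())


# Hypothetical translation store for a French source text.
store = {
    "le": ["the"],
    "chat": ["cat"],
    "dort": ["sleeps", "is", "sleeping"],
}
cands = candidate_words("Le chat dort", store)
```

Restricting the hypothesis generator to `cands` shrinks the vocabulary it has to search, which is the point of this embodiment.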

Returning to Figure 1, the speech recognition system further comprises an acoustic model generator 18 for generating an acoustic model for each speech hypothesis generated by the speech hypothesis generator 16. The acoustic model generator 18 forms an acoustic model of a speech hypothesis by substituting, for each word in the speech hypothesis, an acoustic model of the word from a set of stored acoustic models.

The stored acoustic models may be, for example, Markov models or other dynamic programming type models. The parameters of the acoustic Markov models may be estimated from a known uttered training text by, for example, the Forward-Backward Algorithm. (See, for example, L.R. Bahl, et al. "A Maximum Likelihood Approach to Continuous Speech Recognition." IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume PAMI-5, No. 2, pages 179-190, March 1983.) The models may be context-independent or context-dependent. The models may be built up from submodels of phonemes.

Context-independent acoustic Markov models may be produced, for example, by the method described in U.S. Patent 4,759,068 entitled "Constructing Markov Models of Words From Multiple Utterances," or by any other known method of generating acoustic word models.

For context-dependent acoustic Markov word models, the context can be, for example, manually or automatically selected. One method of automatically selecting context is described in European Patent Application 90 122 396.6, entitled "Apparatus and Method For Grouping Utterances of a Phoneme Into Context-Dependent Categories Based on Sound-Similarity For Automatic Speech Recognition."

An acoustic match score generator 24 generates an acoustic match score for each speech hypothesis. Each acoustic match score comprises an estimate of the closeness of a match between the acoustic model of the speech hypothesis and the sequence of coded representations of the utterance.

When the acoustic models are Markov models, acoustic match scores may be obtained, for example, by the forward pass of the Forward-Backward Algorithm. (See, for example, L.R. Bahl, et al., March 1983, cited above.)

As discussed above, the speech hypothesis generator 16 generates hypotheses by finding a reasonable number of single-word hypotheses which are good candidates for a portion of the utterance to be recognized, and by systematically searching for successively longer word strings which are good candidates for longer portions of the utterance to be recognized.

The acoustic match score generator 24 preferably generates two types of acoustic match scores: (1) a relatively fast, relatively less accurate "fast" acoustic match score, and (2) a relatively slow, relatively more accurate "detailed" acoustic match score. The "fast" match examines at least a portion of every word in the target vocabulary to find a number of words which are good possibilities for extending the candidate word strings. The fast match estimates the closeness of a match between an acoustic fast match model of a word and a portion of the sequence of coded representations of the utterance. The "detailed" match examines only those words which the "fast" match determines to be good possibilities for extending the candidate word strings. The "detailed" acoustic match score estimates the closeness of a match between an acoustic detailed match model of a word and the sequence of coded representations of the utterance.
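The division of labor between the fast match and the detailed match can be sketched as a two-pass filter. This is only an outline under stated assumptions: the scoring callables stand in for the real acoustic models, and `shortlist_size` and the toy scores are hypothetical.

```python
def two_pass_match(vocabulary, fast_score, detailed_score, shortlist_size):
    """Score every word cheaply, then rescore only the shortlist in detail."""
    # Pass 1: the fast match looks at every word in the vocabulary.
    shortlist = sorted(vocabulary, key=fast_score, reverse=True)[:shortlist_size]
    # Pass 2: the detailed match is applied only to the surviving words.
    return sorted(shortlist, key=detailed_score, reverse=True)


# Hypothetical scores (higher means a closer match).
fast = {"cat": 0.9, "cap": 0.8, "cut": 0.7, "dog": 0.1}
detailed = {"cat": 0.6, "cap": 0.7, "cut": 0.2, "dog": 0.99}

ranked = two_pass_match(fast.keys(), fast.get, detailed.get, shortlist_size=2)
```

Note that "dog" is never examined by the detailed match even though its detailed score would be high; this is the accuracy cost traded for speed by any shortlist scheme.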

Still referring to Figure 1, the speech recognition system further comprises a translation match score generator 26 for generating a translation match score for each speech hypothesis. Each translation match score comprises an estimate of the probability of occurrence of the speech hypothesis given the occurrence of the source text.

The translation match score generator 26 will now be described. The role of the translation match score generator is to compute a translation match score Score(S, T·) that a finite sequence S of source words is the translation of a sequence of target words beginning with the finite sequence T. Here and in the following, T· will denote the set of all complete target sentences that begin with the sequence of target words T. A complete sentence is a sequence that ends in a special end-of-sentence marker.

In one embodiment, the translation match score Score(S, T·) is an estimate of a conditional probability P(T·|S), while in another embodiment the translation match score is an estimate of a joint probability P(S, T·).

In the latter embodiment the translation match score generator includes three components:

1. a language match score generator which computes an estimate P(T) of the prior probability of a target word sequence T;

2. a conditional translation match score generator which computes an estimate P(S|T) of the conditional probability of a source word sequence S given a target word sequence T; and

3. a combined score generator which uses the language match score and the conditional translation match score to produce an estimate of a joint probability P(S, T·).

The combined match score generator will now be described. In the prior art, language match scores and conditional translation match scores are combined only when the words of S are generated from the words of T and no other words. In contrast, the combined match score generator must estimate a combined score when S is generated from the words of T together with some additional unspecified words.

In one embodiment, this combined score is computed as a sum over all complete sentences T' in T·:

\[ P(S, T\cdot) = \sum_{T' \in T\cdot} P(T')\, P(S \mid T') \qquad [1] \]

The probability P(T') is obtained from the language match score generator, and the probability P(S|T') is obtained from the conditional translation match score generator.

In other embodiments, various approximations are made to simplify the computation of this sum. One such approximation is

\[ P(S, T\cdot) = \sum_{k=0}^{n} P(T\cdot_k)\, P(S \mid T\sigma^{k}) \qquad [2] \]

Here T·_k denotes the set of target sequences that begin with T and contain k additional words, n is a parameter specifying the maximum allowed number of additional words, and σ is a special generic target word. For the specific embodiments of the language match score generator and the conditional translation match score generator described above, this approximation leads to the formula

*-o v £3] 

ft (Z p* (s * 1 T * )pe(i * j ' 1} + kP7(S 1 1 a)P8(i 1 1} ) 

Here p_7(s|σ) are average word translation probabilities, and p_8(i|l) are average alignment probabilities. Also p_9(k|T) is the probability of the set of all complete sentences which begin with T and contain k additional words. In one embodiment this probability is estimated as



50 Cs M if T is a complete -sentence] ^ 

Ps(k I T) = _ ^ otherwise ) 



where q is an estimate of the unigram probability of the end-of-sentence marker.

The conditional translation match score generator will now be described. The task of the conditional 
translation match score generator is to compute a conditional translation score P(S|T) of a sequence S of 
source words given a sequence T of target words. 
In one embodiment of a conditional translation match score generator, the probability of S given T is computed as

\[ P(S \mid T) = p_4(l \mid m) \prod_{i=1}^{l} \sum_{j=1}^{m} p_5(S_i \mid T_j)\, p_6(i \mid j, l) \qquad [5] \]

Here l is the length of S, m is the length of T, S_i is the i-th word of S, and T_j is the j-th word of T. The parameters of the model are:

1. sequence length probabilities p_4(l|m) satisfying

\[ \sum_{l} p_4(l \mid m) = 1; \]

2. word translation probabilities p_5(s|t) for source words s and target words t satisfying

\[ \sum_{s} p_5(s \mid t) = 1; \]

3. alignment probabilities p_6(i|j,l) satisfying

\[ \sum_{i} p_6(i \mid j, l) = 1. \]
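The product-of-sums structure of this score, read as P(S|T) = p_4(l|m) Π_i Σ_j p_5(S_i|T_j) p_6(i|j,l), can be sketched as follows. The three parameter tables are passed in as plain callables; the toy values, the uniform alignment, and all function names are assumptions made for illustration and are not taken from the patent.

```python
import math


def conditional_translation_score(S, T, p4, p5, p6):
    """P(S|T) = p4(l|m) * prod_i sum_j p5(S_i|T_j) * p6(i|j,l)."""
    l, m = len(S), len(T)
    score = p4(l, m)
    for i, s in enumerate(S, start=1):          # each source word S_i
        score *= sum(
            p5(s, t) * p6(i, j, l)              # translation x alignment
            for j, t in enumerate(T, start=1)   # each target word T_j
        )
    return score


# Hypothetical parameter values.
translation_table = {("le", "the"): 0.7, ("chat", "cat"): 0.8}


def p4(l, m):
    return 0.25                                  # toy length probability


def p5(s, t):
    return translation_table.get((s, t), 0.05)   # toy floor for unseen pairs


def p6(i, j, l):
    return 1.0 / l                               # toy uniform alignment


score = conditional_translation_score(["le", "chat"], ["the", "cat"], p4, p5, p6)
```

With these numbers the two inner sums are 0.375 and 0.425, so the score is 0.25 x 0.375 x 0.425.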

Values for these parameters can be determined from a large quantity of aligned source-target sentence pairs (S^1, T^1), ..., (S^n, T^n) using a procedure that is explained in detail in the above mentioned patent. Briefly, this procedure works as follows. The probability of the aligned sentence pairs is a computable function of the parameter values. The goal of the procedure is to find parameter values which locally maximize this function. This is accomplished iteratively. At each step of the iteration, the parameter values are updated according to the formulas:



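\[ p_5(s \mid t) = \lambda_5^{-1}\, c_5(s \mid t); \qquad p_6(j \mid i, l) = \lambda_6^{-1}\, c_6(j \mid i, l) \qquad [6] \]

where

\[ c_5(s \mid t) = \sum_{n} c_5(s \mid t;\, S^n, T^n); \qquad c_6(j \mid i, l) = \sum_{n} c_6(j \mid i;\, S^n, T^n) \qquad [7] \]

\[ c_5(s \mid t;\, S, T) = \sum_{i=1}^{l} \sum_{j=1}^{m} \delta(s, S_i)\, \delta(t, T_j)\, \alpha(j, i, S, T); \qquad c_6(j \mid i;\, S, T) = \alpha(j, i, S, T) \qquad [8] \]

\[ \alpha(j, i, S, T) = \frac{\beta(j, i, S, T)}{\sum_{j'} \beta(j', i, S, T)}; \qquad \beta(j, i, S, T) = p_5(S_i \mid T_j)\, p_6(j \mid i, l) \qquad [9] \]
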
Here the sum on n runs over all source-target sentence pairs (S^n, T^n). The normalization constants λ_5 and λ_6 are chosen so that the updated quantities are conditional probabilities. (See European Patent Application 92 111 725.5 entitled "Method and System For Natural Language Translation.")

The language match score generator will now be described. The task of the language match score generator is to compute a score P(T) for a finite sequence T of target words.

The probability P(T) of occurrence of the target word sequence may be approximated by the product of n-gram probabilities for all n-grams in each string. That is, the probability of a sequence of words may be approximated by the product of the conditional probabilities of each word in the string, given the occurrence of the n-1 words (or absence of words) preceding each word. For example, if n = 3, each trigram probability may represent the probability of occurrence of the third word in the trigram, given the occurrence of the first two words in the trigram.

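The conditional probabilities may be determined empirically by examining large bodies of text. For example, the conditional probability f(W_z|W_x W_y) of word W_z given the occurrence of the string W_x W_y may be estimated from the equation

\[ f(W_z \mid W_x W_y) = \lambda_1 f_1(W_z \mid W_x W_y) + \lambda_2 f_2(W_z \mid W_y) + \lambda_3 f_3(W_z) + \lambda_4 f_4 \qquad [10] \]

where

\[ f_1(W_z \mid W_x W_y) = \frac{n_{xyz}}{n_{xy}} \qquad [11] \]

\[ f_2(W_z \mid W_y) = \frac{n_{yz}}{n_y} \qquad [12] \]

\[ f_3(W_z) = \frac{n_z}{n} \qquad [13] \]

\[ f_4 = \frac{1}{n} \qquad [14] \]

and

\[ \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 = 1 \qquad [15] \]
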

In equations [11]-[14], the count n_xyz is the number of occurrences of the trigram W_x W_y W_z in a large body of training text. The count n_xy is the number of occurrences of the bigram W_x W_y in the training text. Similarly, n_yz is the number of occurrences of the bigram W_y W_z in the training text, n_y is the number of occurrences of word W_y, n_z is the number of occurrences of word W_z, and n is the total number of words in the training text. The values of the coefficients λ_1, λ_2, λ_3, and λ_4 in equations [10] and [15] may be estimated by the deleted interpolation method. (See L.R. Bahl et al., March 1983, cited above.)
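An interpolated trigram estimate built from these counts can be sketched as below. The relative-frequency definitions follow the counts described in the text; taking the constant term f_4 as 1/n is an assumption of this sketch, as are the tiny corpus and the hand-picked coefficients (in practice the coefficients would come from deleted interpolation, which is not shown).

```python
import math
from collections import Counter


def interpolated_trigram(wx, wy, wz, words, lambdas):
    """f(wz | wx wy) as a weighted mix of trigram, bigram, unigram and
    uniform estimates computed from a list of training words."""
    n = len(words)
    unigrams = Counter(words)
    bigrams = Counter(zip(words, words[1:]))
    trigrams = Counter(zip(words, words[1:], words[2:]))

    f1 = trigrams[(wx, wy, wz)] / bigrams[(wx, wy)] if bigrams[(wx, wy)] else 0.0
    f2 = bigrams[(wy, wz)] / unigrams[wy] if unigrams[wy] else 0.0
    f3 = unigrams[wz] / n
    f4 = 1.0 / n  # assumed form of the constant term

    l1, l2, l3, l4 = lambdas  # must sum to 1
    return l1 * f1 + l2 * f2 + l3 * f3 + l4 * f4


# Hypothetical training text and coefficients.
training_words = "the cat sat on the mat the cat ran".split()
p = interpolated_trigram("the", "cat", "sat", training_words, (0.5, 0.3, 0.15, 0.05))
```

Because the lower-order terms are never all zero for an in-vocabulary word, the mixture assigns non-zero probability to trigrams that were never seen in training.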
In a variation of the trigram language model, the probability P(T) is computed as

\[ P(T) = \prod_{j=1}^{m} p_3(T_j \mid T_{j-2} T_{j-1}) \qquad [16] \]

where

\[ p_3(t_3 \mid t_1 t_2) = \lambda_3(c)\, f_3(t_3 \mid t_1 t_2) + \lambda_2(c)\, f_2(t_3 \mid t_2) + \lambda_1(c)\, f_1(t_3) + \lambda_0(c) \qquad [17] \]

with c = c(t_1 t_2). Here m is the length of T and T_j is the j-th word of T. The parameters of the model are:

1. conditional frequency distributions f_3(t_3|t_1 t_2), f_2(t_3|t_2), f_1(t_3) for target words t_1, t_2, t_3;

2. a bucketing scheme c which assigns word pairs t_1 t_2 to a small number of classes;

3. non-negative interpolation functions λ_i(c), i = 0, 1, 2, 3, which satisfy

\[ \sum_{i} \lambda_i(c) = 1 \]


Values for these parameters can be determined from a large quantity of training target text as described above.

Returning to Figure 1, the speech recognition system comprises a hypothesis score generator 28 for generating a hypothesis score for each hypothesis. Each hypothesis score comprises a combination of the acoustic match score and the translation match score for the hypothesis.

The speech recognition system further comprises a storage device 30 for storing a subset of one or more speech hypotheses, from the set of speech hypotheses, having the best hypothesis scores. An output device 32 outputs at least one word of one or more of the speech hypotheses in the subset of speech hypotheses having the best hypothesis scores.
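Keeping only the best-scoring subset of hypotheses, and emitting a word from it, might look like the following sketch. The use of `heapq.nlargest`, the subset size, and the scores are illustrative choices; the text does not prescribe how the storage device selects its subset.

```python
import heapq


def keep_best(scored_hypotheses, subset_size):
    """Return the subset_size hypotheses with the highest hypothesis scores."""
    return heapq.nlargest(subset_size, scored_hypotheses, key=lambda item: item[1])


# Hypothetical (hypothesis, hypothesis score) pairs.
scored = [
    ("the cat sleeps", 6.0e-3),
    ("the hat sleeps", 2.5e-4),
    ("a cat sleeps", 1.2e-3),
]

best_store = keep_best(scored, 2)

# Output at least one word of the best hypothesis in the stored subset.
first_word = best_store[0][0].split()[0]
```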

Figure 3 is a block diagram of a portion of an example of a speech recognition system according to the invention. In this embodiment of the invention, the system comprises a source vocabulary store 33 for storing the source language vocabulary. A comparator 34 compares each word in the source text provided by source text input device 12 to each word in the source language vocabulary store 33 for the purpose of identifying each word in the source text that is not a word in the source language vocabulary. The acoustic model generator 18 generates an acoustic model of at least one word in the source text which is not in the source vocabulary.

The comparator 34 may also construct, for each word in the source text that is not a word in the source language vocabulary, a sequence of characters that may be a translation of that word into the target language, and place any such possible translations into the target language vocabulary (not shown). In one embodiment of the invention, this comparator may operate according to a set of rules that describe the manner in which letters in the source language should be rewritten when translated into the target language. For example, if the source language is French and the target language is English, then this set of rules might include the rule that the string of characters phobie should be rewritten as phobia so that the French word hydrophobie is transformed into the English word hydrophobia. Other rules in such a system specify the dropping of accents from letters, or the modification of verbal endings.
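A rule-based rewriter of this kind can be sketched in a few lines. Only the phobie-to-phobia rule comes from the text; the rule table format, the function names, and the use of Unicode decomposition to drop accents are assumptions of this example.

```python
import unicodedata

# (source string, replacement) pairs; only the first rule is from the text.
REWRITE_RULES = [("phobie", "phobia")]


def drop_accents(word):
    """Remove combining accent marks, e.g. 'é' becomes 'e'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def guess_translation(word, rules=REWRITE_RULES):
    """Apply accent dropping, then each spelling rewrite rule in turn."""
    word = drop_accents(word)
    for old, new in rules:
        word = word.replace(old, new)
    return word


guess = guess_translation("hydrophobie")
```

The resulting string is only a candidate spelling to be added to the target vocabulary; nothing here checks that it is a real target-language word.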

The comparator 34 may also identify words in the source text that begin with an uppercase letter, but do not appear in the source language vocabulary, and place them into the target language vocabulary. Referring again to the example of French as the source language and English as the target language, if the word Microsoft appears in the source text, but not in the source language vocabulary, then it is added to the target language vocabulary. Many proper names are missing from even large vocabularies and yet are often translated directly from one language to another with no change in spelling.
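The capitalized out-of-vocabulary check can be sketched as below. It assumes the source vocabulary is stored in lowercase and that surrounding punctuation should be stripped before lookup; both are choices made for this illustration, and the sample sentence and vocabulary are hypothetical.

```python
def capitalized_unknown_words(source_text, source_vocabulary):
    """Find words that start with an uppercase letter and are missing
    from the (lowercase) source vocabulary."""
    found = []
    for token in source_text.split():
        word = token.strip(".,;:!?\"'()")
        if word and word[0].isupper() and word.lower() not in source_vocabulary:
            found.append(word)
    return found


source_vocabulary = {"le", "logiciel", "de", "fonctionne"}
target_vocabulary = {"the", "software", "of", "works"}

# Sentence-initial "Le" is capitalized but known, so only "Microsoft" is added.
new_words = capitalized_unknown_words(
    "Le logiciel de Microsoft fonctionne.", source_vocabulary
)
target_vocabulary.update(new_words)
```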
In one embodiment of the invention, the acoustic model generator 18 generates an acoustic model of a word by replacing each letter in the spelling of the word with an acoustic letter model, from an acoustic letter model store 35, corresponding to the letter. (See, for example, L.R. Bahl, et al. "Automatic Determination of Pronunciation of Words From Their Spellings." IBM Technical Disclosure Bulletin, Volume 32, No. 10B, March 1990, pages 19-23; and J.M. Lucassen, et al. "An Information Theoretic Approach To The Automatic Determination Of Phonemic Baseforms." Proceedings of the 1984 International Conference on Acoustics, Speech, and Signal Processing, Vol. 3, pages 42.5.1-42.5.4, March 1984.)
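Building a word model from letter models amounts to a lookup per letter followed by concatenation. In this sketch the letter model store is a dictionary of placeholder identifiers, and letters without a stored model are skipped; both are assumptions, since the text does not say how the store is organized or how missing letters are handled.

```python
def word_model_from_spelling(word, letter_models):
    """Replace each letter of the spelling with its acoustic letter model."""
    return [letter_models[ch] for ch in word.lower() if ch in letter_models]


# Hypothetical store: each value stands in for a real acoustic letter model.
letter_model_store = {"a": "M_a", "b": "M_b", "c": "M_c"}

model = word_model_from_spelling("Cab", letter_model_store)
```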

In the speech recognition system according to the invention, the acoustic model generator 18, the acoustic match score generator 24, the translation match score generator 26, the hypothesis generator 16, the hypothesis score generator 28, and the comparator 34 may be made by programming a general purpose or special purpose digital computer system. The source text input device 12, the best hypothesis store 30, the source vocabulary store 33, and the acoustic letter model store 35 may comprise a computer memory such as a read only memory or a read/write memory. The output device 32 may be, for example, a display such as a cathode ray tube or liquid crystal display, a printer, a loudspeaker, or a speech synthesizer.

Figure 4 is a block diagram of an example of an acoustic processor 14 (Figure 1) for a speech recognition apparatus according to the present invention. An acoustic feature value measure 36 is provided for measuring the value of at least one feature of an utterance over each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. Table 1 illustrates a hypothetical series of one-dimension feature vector signals corresponding to time intervals t1, t2, t3, t4, and t5, respectively.





TABLE 1

time             t1     t2     t3     t4     t5
Feature Value    0.18   0.52   0.96   0.61   0.84


A prototype vector store 38 stores a plurality of prototype vector signals. Each prototype vector signal 
has at least one parameter value and has a unique identification value. 

Table 2 shows a hypothetical example of five prototype vector signals having one parameter value each, and having identification values P1, P2, P3, P4, and P5, respectively.



TABLE 2

Prototype Vector
Identification Value    P1     P2     P3     P4     P5
Parameter Value         0.45   0.59   0.93   0.76   0.21



A comparison processor 40 compares the closeness of the feature value of each feature vector signal to the parameter values of the prototype vector signals to obtain prototype match scores for each feature vector signal and each prototype vector signal.

Table 3 illustrates a hypothetical example of prototype match scores for the feature vector signals of Table 1, and the prototype vector signals of Table 2.







TABLE 3

Prototype Vector Match Scores

time                    t1     t2     t3     t4     t5
Prototype Vector
Identification Value
P1                      0.27   0.07   0.51   0.16   0.39
P2                      0.41   0.07   0.37   0.02   0.25
P3                      0.75   0.41   0.03   0.32   0.09
P4                      0.58   0.24   0.2    0.15   0.08
P5                      0.03   0.31   0.75   0.4    0.63
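The tabulated match scores are numerically consistent with taking the absolute difference between each feature value of Table 1 and each parameter value of Table 2. That reading is an observation about the example numbers, not a definition given in the text, and the sketch below simply reproduces the table under that assumption.

```python
# Values from Table 1 and Table 2.
features = {"t1": 0.18, "t2": 0.52, "t3": 0.96, "t4": 0.61, "t5": 0.84}
prototypes = {"P1": 0.45, "P2": 0.59, "P3": 0.93, "P4": 0.76, "P5": 0.21}


def match_scores(features, prototypes):
    """Distance of every feature value to every prototype parameter value,
    rounded to two decimals as in the table (smaller means closer)."""
    return {
        proto: {t: round(abs(value - param), 2) for t, value in features.items()}
        for proto, param in prototypes.items()
    }


scores = match_scores(features, prototypes)
```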



In the hypothetical example, the feature vector signals and the prototype vector signals are shown as having one dimension only, with only one parameter value for that dimension. In practice, however, the feature vector signals and prototype vector signals may have, for example, fifty dimensions, where each dimension has two parameter values. The two parameter values of each dimension may be, for example, a mean value and a standard deviation (or variance) value.

Still referring to Figure 4, the speech recognition and speech coding apparatus further comprises a rank score processor 42 for associating, for each feature vector signal, a first-rank score with the prototype vector signal having the best prototype match score, and a second-rank score with the prototype vector signal having the second best prototype match score.

Preferably, the rank score processor 42 associates a rank score with all prototype vector signals for each feature vector signal. Each rank score represents the estimated closeness of the associated prototype vector signal to the feature vector signal relative to the estimated closeness of all other prototype vector signals to the feature vector signal. More specifically, the rank score for a selected prototype vector signal for a given feature vector signal is monotonically related to the number of other prototype vector signals having prototype match scores better than the prototype match score of the selected prototype vector signal for the given feature vector signal.

Table 4 shows a hypothetical example of prototype vector rank scores obtained from the prototype match scores of Table 3.







TABLE 4

Prototype Vector Rank Scores

time                    t1     t2     t3     t4     t5
Prototype Vector
Identification Value
P1                      2      1      4      3      4
P2                      3      1      3      1      3
P3                      5      5      1      4      2
P4                      4      3      2      2      1
P5                      1      4      5      5      5



As shown in Tables 3 and 4, the prototype vector signal P5 has the best (in this case the closest)
prototype match score with the feature vector signal at time t1 and is therefore associated with the first-rank
score of "1". The prototype vector signal P1 has the second best prototype match score with the feature
vector signal at time t1, and therefore is associated with the second-rank score of "2". Similarly, for the
feature vector signal at time t1, prototype vector signals P2, P4, and P3 are ranked "3", "4" and "5"
respectively. Thus, each rank score represents the estimated closeness of the associated prototype vector
signal to the feature vector signal relative to the estimated closeness of all other prototype vector signals to
the feature vector signal.

Alternatively, as shown in Table 5, it is sufficient that the rank score for a selected prototype vector
signal for a given feature vector signal is monotonically related to the number of other prototype vector
signals having prototype match scores better than the prototype match score of the selected prototype
vector signal for the given feature vector signal. Thus, for example, prototype vector signals P5, P1, P2, P4,
and P3 could have been assigned rank scores of "1", "2", "3", "3" and "3", respectively. In other words,
the prototype vector signals can be ranked either individually, or in groups.
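The two ranking schemes can be illustrated with a minimal sketch. The function name `rank_scores`, the `max_rank` parameter, and the match score values below are hypothetical (they are not the values of Table 3); the sketch assumes that a smaller match score means a closer prototype.

```python
def rank_scores(match_scores, max_rank=None):
    """Map prototype match scores (smaller = closer) to rank scores.

    The rank of a prototype is 1 plus the number of other prototypes with
    a strictly better (smaller) match score. If max_rank is given, every
    rank above it is clamped to max_rank, so the remaining prototypes are
    ranked as one group.
    """
    ranks = {}
    for proto, score in match_scores.items():
        better = sum(1 for other, s in match_scores.items()
                     if other != proto and s < score)
        rank = better + 1
        if max_rank is not None:
            rank = min(rank, max_rank)
        ranks[proto] = rank
    return ranks


# Hypothetical match scores for one feature vector (illustrative only).
scores_t1 = {"P1": 0.46, "P2": 0.59, "P3": 0.96, "P4": 0.72, "P5": 0.11}

individual = rank_scores(scores_t1)            # ranked individually
grouped = rank_scores(scores_t1, max_rank=3)   # ranks 3, 4, 5 share "3"
```

With these assumed scores, the individual ranking orders the prototypes P5, P1, P2, P4, P3, and the grouped ranking assigns the last three the same rank score, as in the example in the text.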






TABLE 5

Prototype Vector Rank Scores (alternative)

Prototype Vector               time
Identification Value     t1   t2   t3   t4   t5

P1                        2    1    3    3    3
P2                        3    1    3    1    3
P3                        3    3    1    3    2
P4                        3    3    2    2    1
P5                        1    3    3    3    3

In addition to producing the rank scores, rank score processor 42 outputs, for each feature vector
signal, at least the identification value and the rank score of the first-ranked prototype vector signal, and the
identification value and the rank score of the second-ranked prototype vector signal, as a coded utterance
representation signal of the feature vector signal, to produce a series of coded utterance representation
signals.

One example of an acoustic feature value measure is shown in Figure 5. The measuring means
includes a microphone 44 for generating an analog electrical signal corresponding to the utterance. The
analog electrical signal from microphone 44 is converted to a digital electrical signal by analog to digital
converter 46. For this purpose, the analog signal may be sampled, for example, at a rate of twenty kilohertz
by the analog to digital converter 46.

A window generator 48 obtains, for example, a twenty millisecond duration sample of the digital signal
from analog to digital converter 46 every ten milliseconds (one centisecond). Each twenty millisecond
sample of the digital signal is analyzed by spectrum analyzer 50 in order to obtain the amplitude of the
digital signal sample in each of, for example, twenty frequency bands. Preferably, spectrum analyzer 50
also generates a twenty-first dimension signal representing the total amplitude or total power of the twenty
millisecond digital signal sample. The spectrum analyzer 50 may be, for example, a fast Fourier transform
processor. Alternatively, it may be a bank of twenty band pass filters.
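The windowing and band analysis described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, a plain discrete Fourier transform stands in for the fast Fourier transform processor, and the twenty bands are assumed to be of equal width, which the text does not specify.

```python
import cmath
import math

SAMPLE_RATE_HZ = 20_000   # sampling rate given in the text
WINDOW_MS = 20            # window duration
STEP_MS = 10              # one window every centisecond


def frame_signal(samples):
    """Split a sample list into overlapping 20 ms windows taken every 10 ms."""
    window = SAMPLE_RATE_HZ * WINDOW_MS // 1000   # 400 samples
    step = SAMPLE_RATE_HZ * STEP_MS // 1000       # 200 samples
    return [samples[s:s + window]
            for s in range(0, len(samples) - window + 1, step)]


def feature_vector(frame, n_bands=20):
    """Return n_bands band amplitudes plus one total-amplitude dimension."""
    n = len(frame)
    # Plain DFT magnitudes for the non-negative frequencies
    # (slow, but free of external dependencies).
    mags = [abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(frame)))
            for k in range(n // 2)]
    width = len(mags) // n_bands          # assumption: equal-width bands
    bands = [sum(mags[b * width:(b + 1) * width]) for b in range(n_bands)]
    return bands + [sum(bands)]           # 21st dimension: total amplitude


# Usage: 30 ms of a 1 kHz tone yields two windows of 400 samples each.
tone = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE_HZ) for i in range(600)]
frames = frame_signal(tone)
vector = feature_vector(frames[0])
```

Each window overlaps its neighbour by half, so consecutive feature vectors are one centisecond apart while each still covers twenty milliseconds of signal.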

The twenty-one dimension vector signals produced by spectrum analyzer 50 may be adapted to
remove background noise by an adaptive noise cancellation processor 52. Noise cancellation processor 52
subtracts a noise vector N(t) from the feature vector F(t) input into the noise cancellation processor to
produce an output feature vector F(t). The noise cancellation processor 52 adapts to changing noise levels
by periodically updating the noise vector N(t) whenever the prior feature vector F(t-1) is identified as noise
or silence. The noise vector N(t) is updated according to the formula

N(t) = N(t-1) + k[F(t-1) - Fp(t-1)],   [18]

where N(t) is the noise vector at time t, N(t-1) is the noise vector at time (t-1), k is a fixed parameter of the
adaptive noise cancellation model, F(t-1) is the feature vector output from the noise cancellation processor
52 at time (t-1) and which represents noise or silence, and Fp(t-1) is one silence or noise prototype vector,
from store 54, closest to feature vector F(t-1).
The prior feature vector F(t-1) is recognized as noise or silence if either (a) the total energy of the
vector is below a threshold, or (b) the closest prototype vector in adaptation prototype vector store 56 to the
feature vector is a prototype representing noise or silence. For the purpose of the analysis of the total
energy of the feature vector, the threshold may be, for example, the fifth percentile of all feature vectors
(corresponding to both speech and silence) produced in the two seconds prior to the feature vector being
evaluated.
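The noise vector update of Equation 18 and the energy threshold can be sketched as below. The function names, the value of k used in the usage lines, and the exact way the fifth percentile is picked from the recent energies are assumptions made for illustration; the text fixes none of them.

```python
def energy_threshold(recent_energies, percentile=5):
    """Pick the given percentile of the recent total energies.

    Assumption: the percentile is taken as the element at position
    len * percentile / 100 of the sorted list.
    """
    ordered = sorted(recent_energies)
    index = min(len(ordered) - 1, len(ordered) * percentile // 100)
    return ordered[index]


def update_noise(noise_prev, feature_prev, noise_prototype, k):
    """Equation 18: N(t) = N(t-1) + k[F(t-1) - Fp(t-1)], per dimension."""
    return [n + k * (f - p)
            for n, f, p in zip(noise_prev, feature_prev, noise_prototype)]


def cancel_noise(feature, noise):
    """Subtract the noise vector from a feature vector, per dimension."""
    return [f - n for f, n in zip(feature, noise)]


# Usage with two-dimensional toy vectors and a hypothetical k = 0.5.
noise = update_noise([1.0, 2.0], [3.0, 2.0], [1.0, 1.0], k=0.5)
cleaned = cancel_noise([5.0, 5.0], noise)

# Two seconds of centisecond frames give 200 energies to rank.
threshold = energy_threshold(list(range(200)))
```

The update would be applied only when the prior feature vector has been classified as noise or silence by test (a) or test (b) above.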

After noise cancellation, the feature vector F(t) is normalized to adjust for variations in the loudness of
the input speech by short term mean normalization processor 58. Normalization processor 58 normalizes
the twenty-one dimension feature vector F(t) to produce a twenty dimension normalized feature vector X(t).
The twenty-first dimension of the feature vector F(t), representing the total amplitude or total power, is
discarded. Each component i of the normalized feature vector X(t) at time t may, for example, be given by
the equation

X_i(t) = F_i(t) - Z(t)   [19]

in the logarithmic domain, where F_i(t) is the i-th component of the unnormalized vector at time t, and where
Z(t) is a weighted mean of the components of F(t) and Z(t-1) according to Equations 20 and 21:

Z(t) = 0.9 Z(t-1) + 0.1 M(t)   [20]

and where

M(t) = (1/20) Σ_i F_i(t)   [21]
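Equations 19 to 21 amount to subtracting a running mean from each band. A minimal sketch follows, assuming that M(t) is the mean of the twenty band components, that the input vectors are already in the logarithmic domain, and that Z starts from a hypothetical initial value `z0` (the text gives no initial value).

```python
def normalize(frames, z0=0.0):
    """Short term mean normalization of 21-dimension feature vectors.

    Returns one 20-dimension vector X(t) per input vector F(t); the
    21st dimension (total amplitude or power) is discarded.
    """
    z = z0
    normalized = []
    for f in frames:
        bands = f[:20]                     # drop the 21st dimension
        m = sum(bands) / 20                # Eq. 21: mean of the components
        z = 0.9 * z + 0.1 * m              # Eq. 20: running weighted mean
        normalized.append([fi - z for fi in bands])   # Eq. 19
    return normalized


# Usage: one frame whose components all equal 10.0.
x = normalize([[10.0] * 21])
```

Because Z(t) weights its own previous value by 0.9, it follows slow changes in loudness and leaves the frame-to-frame spectral shape largely intact.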



The normalized twenty dimension feature vector X(t) may be further processed by an adaptive labeler 60 to
adapt to variations in pronunciation of speech sounds. An adapted twenty dimension feature vector X'(t) is
generated by subtracting a twenty dimension adaptation vector A(t) from the twenty dimension feature
vector X(t) provided to the input of the adaptive labeler 60. The adaptation vector A(t) at time t may, for
example, be given by the formula

A(t) = A(t-1) + k[X'(t-1) - Xp(t-1)],   [22]

where k is a fixed parameter of the adaptive labeling model, X'(t-1) is the normalized twenty dimension
vector output from the adaptive labeler 60 at time (t-1), Xp(t-1) is the adaptation prototype vector (from
adaptation prototype store 56) closest to the twenty dimension feature vector X'(t-1) at time (t-1), and A(t-1)
is the adaptation vector at time (t-1).

The adapted twenty dimension feature vector signal X'(t) from the adaptive labeler 60 is preferably
provided to an auditory model 62. Auditory model 62 may, for example, provide a model of how the human
auditory system perceives sound signals. An example of an auditory model is described in U.S. Patent
4,980,918 to Bahl et al entitled "Speech Recognition System with Efficient Storage and Rapid Assembly of
Phonological Graphs".

Preferably, according to the present invention, for each frequency band i of the adapted feature vector
signal X'(t) at time t, the auditory model 62 calculates a new parameter E_i(t) according to Equations 23 and
24:

E_i(t) = K1 + K2 (X'_i(t)) (N_i(t-1))   [23]

where

N_i(t) = K3 x N_i(t-1) - E_i(t-1)   [24]



and where K1, K2, and K3 are fixed parameters of the auditory model.
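The recursion of Equations 23 and 24 can be sketched per frequency band as follows. The function name, the initial values `n0` and `e0`, and the parameter values in the usage lines are hypothetical; the text states only that K1, K2, and K3 are fixed.

```python
def auditory_model(frames, k1, k2, k3, n0=1.0, e0=0.0):
    """Apply Equations 23 and 24 to a sequence of adapted vectors X'(t).

    n_prev holds N_i(t-1) and e_prev holds E_i(t-1) for every band i.
    """
    dims = len(frames[0])
    n_prev = [n0] * dims
    e_prev = [e0] * dims
    outputs = []
    for x in frames:
        # Eq. 23: E_i(t) = K1 + K2 * X'_i(t) * N_i(t-1)
        e = [k1 + k2 * xi * ni for xi, ni in zip(x, n_prev)]
        # Eq. 24: N_i(t) = K3 * N_i(t-1) - E_i(t-1)
        n = [k3 * ni - ei for ni, ei in zip(n_prev, e_prev)]
        outputs.append(e)
        n_prev, e_prev = n, e
    return outputs


# Usage with one band, two frames, and hypothetical parameters.
modified = auditory_model([[1.0], [1.0]], k1=1.0, k2=2.0, k3=0.5)
```

Note that both equations read the previous value N_i(t-1), so the sketch computes E(t) and N(t) from the stored state before overwriting it.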

For each centisecond time interval, the output of the auditory model 62 is a modified twenty dimension
feature vector signal. This feature vector is augmented by a twenty-first dimension having a value equal to
the square root of the sum of the squares of the values of the other twenty dimensions.

For each centisecond time interval, a concatenator 64 preferably concatenates nine twenty-one
dimension feature vectors representing the one current centisecond time interval, the four preceding
centisecond time intervals, and the four following centisecond time intervals to form a single spliced vector
of 189 dimensions. Each 189 dimension spliced vector is preferably multiplied in a rotator 66 by a rotation
matrix to rotate the spliced vector and to reduce the spliced vector to fifty dimensions.
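The augmentation, splicing, and rotation steps can be sketched as below. The function names are hypothetical, and the rotation matrix in the usage lines is a placeholder that merely keeps the first fifty dimensions; a real matrix would be derived from training data.

```python
import math


def augment(vec20):
    """Append a 21st dimension: the Euclidean norm of the other twenty."""
    return vec20 + [math.sqrt(sum(v * v for v in vec20))]


def splice(frames21, t, context=4):
    """Concatenate the frame at index t with `context` frames on each side.

    Assumes t has at least `context` neighbours on both sides.
    """
    window = frames21[t - context:t + context + 1]
    return [v for frame in window for v in frame]


def rotate(spliced, rotation_matrix):
    """Multiply a spliced vector by a (rows x 189) rotation matrix."""
    return [sum(r * s for r, s in zip(row, spliced))
            for row in rotation_matrix]


# Usage: nine augmented frames give one 189-dimension spliced vector.
frames = [augment([float(i)] * 20) for i in range(9)]
spliced = splice(frames, t=4)

# Placeholder 50 x 189 matrix (not a trained rotation matrix).
placeholder = [[1.0 if c == r else 0.0 for c in range(189)] for r in range(50)]
reduced = rotate(spliced, placeholder)
```

Nine frames of twenty-one dimensions give 9 x 21 = 189 dimensions, which the matrix product then reduces to fifty.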

The rotation matrix used in rotator 66 may be obtained, for example, by classifying into M classes a set
of 189 dimension spliced vectors obtained during a training session. The inverse of the covariance matrix
for all of the spliced vectors in the training set is multiplied by the within-sample covariance matrix for all of
the spliced vectors in all M classes. The first fifty eigenvectors of the resulting matrix form the rotation
matrix. (See, for example, "Vector Quantization Procedure For Speech Recognition Systems Using Discrete
Parameter Phoneme-Based Markov Word Models" by L.R. Bahl, et al, IBM Technical Disclosure Bulletin.)

Window generator 48, spectrum analyzer 50, adaptive noise cancellation processor 52, short term mean
normalization processor 58, adaptive labeler 60, auditory model 62, concatenator 64, and rotator 66, may be
suitably programmed special purpose or general purpose digital signal processors. Prototype stores 54 and
56 may be electronic computer memory of the types discussed above.

The prototype vectors in prototype store 38 may be obtained, for example, by clustering feature vector
signals from a training set into a plurality of clusters, and then calculating the mean and standard deviation
for each cluster to form the parameter values of the prototype vector. When the training script comprises a
series of word-segment models (forming a model of a series of words), and each word-segment model
comprises a series of elementary models having specified locations in the word-segment models, the
feature vector signals may be clustered by specifying that each cluster corresponds to a single elementary
model in a single location in a single word-segment model. Such a method is described in more detail in
European Patent Application 92 108 483.6, entitled "Fast Algorithm for Deriving Acoustic Prototypes for
Automatic Speech Recognition".

Alternatively, all acoustic feature vectors generated by the utterance of a training text and which
correspond to a given elementary model may be clustered by K-means Euclidean clustering or K-means
Gaussian clustering, or both. Such a method is described, for example, in European Patent Application 91
121 180.3 entitled "Speaker-Independent Label Coding Apparatus".
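As a rough illustration of deriving prototype parameter values by clustering, the sketch below runs a basic K-means Euclidean clustering and then computes a mean and standard deviation per dimension for one cluster. It is not the method of the cited applications: the function names, the choice of the first k vectors as initial centers, the fixed iteration count, and the use of the population standard deviation are all assumptions.

```python
import math


def kmeans(vectors, k, iterations=10):
    """Basic K-means with squared Euclidean distance.

    Assumption: the first k vectors serve as the initial centers.
    """
    centers = [list(v) for v in vectors[:k]]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k),
                          key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(v, centers[c])))
            clusters[nearest].append(v)
        for c, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members)
                              for col in zip(*members)]
    return clusters


def prototype(members):
    """Mean and (population) standard deviation for each dimension."""
    n = len(members)
    columns = list(zip(*members))
    means = [sum(col) / n for col in columns]
    stds = [math.sqrt(sum((x - m) ** 2 for x in col) / n)
            for col, m in zip(columns, means)]
    return means, stds


# Usage with one-dimensional toy feature vectors.
clusters = kmeans([[0.0], [10.0], [1.0], [11.0]], k=2)
means, stds = prototype(clusters[0])
```

Each resulting (mean, standard deviation) pair per dimension corresponds to the two parameter values of a prototype vector described earlier.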
Claims

1. A speech recognition system comprising:
means for displaying a source text comprising one or more words in a source language;
an acoustic processor for generating a sequence of coded representations of an utterance to be
recognized, said utterance comprising a series of one or more words in a target language different from
the source language;
means for generating a set of one or more speech hypotheses, each speech hypothesis comprising
one or more words from the target language;
means for generating an acoustic model of each speech hypothesis;
means for generating an acoustic match score for each speech hypothesis, each acoustic match score
comprising an estimate of the closeness of a match between the acoustic model of the speech
hypothesis and the sequence of coded representations of the utterance;
means for generating a translation match score for each speech hypothesis, each translation match
score comprising an estimate of the probability of occurrence of the speech hypothesis given the
occurrence of the source text;
means for generating a hypothesis score for each hypothesis, each hypothesis score comprising a
combination of the acoustic match score and the translation match score for the hypothesis;
means for storing a subset of one or more speech hypotheses, from the set of speech hypotheses,
having the best hypothesis scores; and
means for outputting at least one word of one or more of the speech hypotheses in the subset of
speech hypotheses having the best hypothesis scores.



2. A speech recognition system as claimed in Claim 1, characterized in that:
the system further comprises means for generating a set of candidate words consisting solely of words
in the target language which are partial or full translations of words in the source text; and
the speech hypothesis generator generates one or more speech hypotheses solely from words in the
set of candidate words.

3. A speech recognition system as claimed in Claim 1, characterized in that the translation match score
for a speech hypothesis comprises an estimate of the probability of occurrence of the source text given
the occurrence of the speech hypothesis combined with an estimate of the probability of occurrence of
the speech hypothesis.

4. A speech recognition system as claimed in Claim 3, characterized in that the probability of occurrence
of the source text given the occurrence of the speech hypothesis comprises an estimate, for each word
in the source text, of the probability of the word in the source text given the occurrence of each word in
the speech hypothesis.

5. A speech recognition system as claimed in Claim 1, characterized in that the acoustic match score
comprises an estimate of the probability of occurrence of the sequence of coded representations of the
utterance given the occurrence of the speech hypothesis.

6. A speech recognition system as claimed in Claim 5, characterized in that the hypothesis score
comprises the product of the acoustic match score multiplied by the translation match score.

7. A speech recognition system as claimed in Claim 1, further comprising:
means for storing a source vocabulary of words in the source language;
means for comparing each word in the source text with each word in the source vocabulary to identify
each word in the source text which is not in the source vocabulary; and
means for generating an acoustic model of at least one word in the source text which is not in the
source vocabulary.

8. A speech recognition system as claimed in Claim 7, characterized in that:
each word in the source text has a spelling comprising one or more letters, each letter being upper
case or being lower case;
the system further comprises means for identifying each word in the source text which has an upper
case first letter; and
the means for generating an acoustic model generates an acoustic model of each word in the source
text which is not in the source vocabulary, and which has an upper case first letter.

9. A speech recognition system as claimed in Claim 7, characterized in that the means for generating an
acoustic model comprises:
means for storing a plurality of acoustic letter models; and
means for generating an acoustic model of a word by replacing each letter in the spelling of the word
with an acoustic letter model corresponding to the letter.

10. A speech recognition system as claimed in Claim 1, characterized in that the output means comprises
a display.

11. A speech recognition system as claimed in Claim 10, characterized in that the display comprises a
cathode ray tube.

12. A speech recognition system as claimed in Claim 10, characterized in that the display comprises a
liquid crystal display.

13. A speech recognition system as claimed in Claim 1, characterized in that the output means comprises
a printer.

14. A speech recognition system as claimed in Claim 1, characterized in that the output means comprises
a loudspeaker.

15. A speech recognition system as claimed in Claim 1, characterized in that the output means comprises
a speech synthesizer.

16. A speech recognition system as claimed in Claim 1, characterized in that the means for storing speech
hypotheses comprises readable computer memory.

17. A speech recognition system as claimed in Claim 1, characterized in that the acoustic processor
comprises:
means for measuring the value of at least one feature of an utterance over each of a series of
successive time intervals to produce a series of feature vector signals representing the feature values;
means for storing a plurality of prototype vector signals, each prototype vector signal having at least
one parameter value and having a unique identification value;
means for comparing the closeness of the feature value of a first feature vector signal to the parameter
values of the prototype vector signals to obtain prototype match scores for the first feature vector signal
and each prototype vector signal;
ranking means for associating a first-rank score with the prototype vector signal having the best
prototype match score, and for associating a second-rank score with the prototype vector signal having
the second best prototype match score; and
means for outputting at least the identification value and the rank score of the first-ranked prototype
vector signal, and the identification value and the rank score of the second-ranked prototype vector
signal, as a coded utterance representation signal of the first feature vector signal.

18. A speech recognition system as claimed in Claim 17, characterized in that the means for measuring the
value of at least one feature of an utterance comprises a microphone.

19. A speech recognition method comprising:
displaying a source text comprising one or more words in a source language;
generating a sequence of coded representations of an utterance to be recognized, said utterance
comprising a series of one or more words in a target language different from the source language;
generating a set of one or more speech hypotheses, each speech hypothesis comprising one or more
words from the target language;
generating an acoustic model of each speech hypothesis;
generating an acoustic match score for each speech hypothesis, each acoustic match score comprising
an estimate of the closeness of a match between the acoustic model of the speech hypothesis and the
sequence of coded representations of the utterance;
generating a translation match score for each speech hypothesis, each translation match score
comprising an estimate of the probability of occurrence of the speech hypothesis given the occurrence
of the source text;
generating a hypothesis score for each hypothesis, each hypothesis score comprising a combination of
the acoustic match score and the translation match score for the hypothesis;
storing a subset of one or more speech hypotheses, from the set of speech hypotheses, having the
best hypothesis scores; and
outputting at least one word of one or more of the speech hypotheses in the subset of speech
hypotheses having the best hypothesis scores.

20. A speech recognition method as claimed in Claim 19, characterized in that:
the method further comprises the step of generating a set of candidate words consisting solely of
words in the target language which are partial or full translations of words in the source text; and
the step of generating speech hypotheses generates one or more speech hypotheses solely from
words in the set of candidate words.

21. A speech recognition method as claimed in Claim 19, characterized in that the translation match score
for a speech hypothesis comprises an estimate of the probability of occurrence of the source text given
the occurrence of the speech hypothesis combined with an estimate of the probability of occurrence of
the speech hypothesis.

22. A speech recognition method as claimed in Claim 21, characterized in that the probability of
occurrence of the source text given the occurrence of the speech hypothesis comprises an estimate,
for each word in the source text, of the probability of the word in the source text given the occurrence
of each word in the speech hypothesis.

23. A speech recognition method as claimed in Claim 19, characterized in that the acoustic match score
comprises an estimate of the probability of occurrence of the sequence of coded representations of the
utterance given the occurrence of the speech hypothesis.

24. A speech recognition method as claimed in Claim 23, characterized in that the hypothesis score
comprises the product of the acoustic match score multiplied by the translation match score.

25. A speech recognition method as claimed in Claim 19, further comprising the steps of:
storing a source vocabulary of words in the source language;
comparing each word in the source text with each word in the source vocabulary to identify each word
in the source text which is not in the source vocabulary; and
generating an acoustic model of at least one word in the source text which is not in the source
vocabulary.

26. A speech recognition method as claimed in Claim 25, characterized in that:
each word in the source text has a spelling comprising one or more letters, each letter being upper
case or being lower case;
the method further comprises the step of identifying each word in the source text which has an upper
case first letter; and
the step of generating an acoustic model comprises generating an acoustic model of each word in the
source text which is not in the source vocabulary, and which has an upper case first letter.

27. A speech recognition method as claimed in Claim 25, characterized in that the step of generating an
acoustic model comprises:
storing a plurality of acoustic letter models; and
generating an acoustic model of a word by replacing each letter in the spelling of the word with an
acoustic letter model corresponding to the letter.

28. A speech recognition method as claimed in Claim 19, characterized in that the output means comprises
a display.

29. A speech recognition method as claimed in Claim 28, characterized in that the display comprises a
cathode ray tube.

30. A speech recognition method as claimed in Claim 28, characterized in that the display comprises a
liquid crystal display.

31. A speech recognition method as claimed in Claim 19, characterized in that the output means comprises
a printer.

32. A speech recognition method as claimed in Claim 19, characterized in that the output means comprises
a loudspeaker.

33. A speech recognition method as claimed in Claim 19, characterized in that the output means comprises
a speech synthesizer.

34. A speech recognition method as claimed in Claim 19, characterized in that the means for storing
speech hypotheses comprises readable computer memory.

35. A speech recognition method as claimed in Claim 19, characterized in that the acoustic processor
comprises:
means for measuring the value of at least one feature of an utterance over each of a series of
successive time intervals to produce a series of feature vector signals representing the feature values;
means for storing a plurality of prototype vector signals, each prototype vector signal having at least
one parameter value and having a unique identification value;
means for comparing the closeness of the feature value of a first feature vector signal to the parameter
values of the prototype vector signals to obtain prototype match scores for the first feature vector signal
and each prototype vector signal;
ranking means for associating a first-rank score with the prototype vector signal having the best
prototype match score, and for associating a second-rank score with the prototype vector signal having
the second best prototype match score; and
means for outputting at least the identification value and the rank score of the first-ranked prototype
vector signal, and the identification value and the rank score of the second-ranked prototype vector
signal, as a coded utterance representation signal of the first feature vector signal.

36. A speech recognition method as claimed in Claim 35, characterized in that the means for measuring
the value of at least one feature of an utterance comprises a microphone.